As the number of heterogeneous IP-connected devices and the volume of traffic increase, so does the potential for security breaches. The undetected exploitation of these breaches can bring severe cybersecurity and privacy risks. Anomaly-based intrusion detection systems (IDSs) play an essential role in network security. In this paper, we present a practical unsupervised anomaly-based deep learning detection system called ARCADE (Adversarially Regularized Convolutional Autoencoder for unsupervised network anomaly DEtection). With a convolutional autoencoder (AE), ARCADE automatically builds a profile of the normal traffic using a subset of raw bytes of a few initial packets of network flows, so that potential network anomalies and intrusions can be efficiently detected before they cause more damage to the network. ARCADE is trained exclusively on normal traffic. An adversarial training strategy is proposed to regularize the AE and decrease its ability to reconstruct network flows that fall outside the normal distribution, thereby improving its anomaly detection capabilities. The proposed approach is more effective than state-of-the-art deep learning approaches for network anomaly detection. Even when examining only two initial packets of a network flow, ARCADE can effectively detect malware infection and network attacks. ARCADE has 20 times fewer parameters than baselines, achieving significantly faster detection speed and reaction time.
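The input and scoring steps described in this abstract can be sketched roughly as follows. This is a minimal illustration, not ARCADE's actual code: the function names, the 100-byte cutoff per packet, and the use of mean squared error as the anomaly score are all assumptions made here for the example, and the autoencoder itself is left out.

```python
from typing import List, Sequence


def flow_to_input(packets: Sequence[bytes], n_packets: int = 2,
                  n_bytes: int = 100) -> List[float]:
    """Flatten the first n_bytes of the first n_packets into a fixed-length vector.

    n_bytes=100 is an assumed value for illustration only.
    """
    vec: List[float] = []
    for i in range(n_packets):
        payload = packets[i][:n_bytes] if i < len(packets) else b""
        # Zero-pad short or missing packets so every flow has the same length.
        padded = payload + bytes(n_bytes - len(payload))
        # Scale raw byte values from [0, 255] to [0, 1].
        vec.extend(b / 255.0 for b in padded)
    return vec


def anomaly_score(x: Sequence[float], x_hat: Sequence[float]) -> float:
    """Mean squared reconstruction error between input and reconstruction."""
    if len(x) != len(x_hat) or not x:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x)


def is_anomalous(score: float, threshold: float) -> bool:
    """Flag a flow whose reconstruction error exceeds a chosen threshold."""
    return score > threshold
```

In such a setup, a model trained only on normal traffic would be expected to reconstruct normal flows well and anomalous flows poorly, so a higher score suggests an anomaly; the threshold would have to be chosen on held-out normal traffic.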
In intensively managed forests in Europe, where forests are divided into stands of small size and may show heterogeneity within stands, a high spatial resolution (10 to 20 meters) is arguably needed to capture the differences in canopy height. In this work, we developed a deep learning model based on multi-stream remote sensing measurements to create a high-resolution canopy height map over the "Landes de Gascogne" forest in France, a large maritime pine plantation of 13,000 km$^2$ with flat terrain and intensive management. This area is characterized by even-aged and mono-specific stands, of a typical length of a few hundred meters, harvested every 35 to 50 years. Our deep learning U-Net model uses multi-band images from Sentinel-1 and Sentinel-2 with composite time averages as input to predict tree height derived from GEDI waveforms. The evaluation is performed with external validation data from forest inventory plots and a stereo 3D reconstruction model based on Skysat imagery available at specific locations. We trained seven different U-Net models based on combinations of Sentinel-1 and Sentinel-2 bands to evaluate the importance of each instrument in the dominant height retrieval. The model outputs allow us to generate a 10 m resolution canopy height map of the whole "Landes de Gascogne" forest area for 2020 with a mean absolute error of 2.02 m on the test dataset. The best predictions were obtained using all available satellite layers from Sentinel-1 and Sentinel-2, but using only one satellite source also provided good predictions. For all validation datasets in coniferous forests, our model showed better metrics than previous canopy height models available in the same region.
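For reference, the mean absolute error quoted above is the average of the absolute differences between predicted and reference heights. A minimal sketch, with a hypothetical function name and made-up heights that are not taken from the paper:

```python
from typing import Sequence


def mean_absolute_error(predicted: Sequence[float],
                        reference: Sequence[float]) -> float:
    """Average absolute difference between predicted and reference heights (m)."""
    if len(predicted) != len(reference) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - r) for p, r in zip(predicted, reference)) / len(predicted)


# Illustrative values only: two pixels with predicted and reference heights in meters.
example_mae = mean_absolute_error([20.0, 15.0], [18.0, 16.0])
```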
Generalization is an important attribute of machine learning models, particularly for those that are to be deployed in a medical context, where unreliable predictions can have real-world consequences. While the failure of models to generalize across datasets is typically attributed to a mismatch in the data distributions, performance gaps are often a consequence of biases in the 'ground-truth' label annotations. This is particularly important in the context of medical image segmentation of pathological structures (e.g. lesions), where the annotation process is much more subjective and is affected by a number of underlying factors, including the annotation protocol, rater education/experience, and clinical aims, among others. In this paper, we show that modeling annotation biases, rather than ignoring them, is a promising way of accounting for differences in annotation style across datasets. To this end, we propose a generalized conditioning framework to (1) learn and account for different annotation styles across multiple datasets using a single model, (2) identify similar annotation styles across different datasets in order to permit their effective aggregation, and (3) fine-tune a fully trained model to a new annotation style with just a few samples. Next, we present an image-conditioning approach to model annotation styles that correlate with specific image features, potentially enabling detection biases to be more easily identified.
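Cyber Threat Intelligence (CTI) sharing is an important activity for reducing information asymmetries between attackers and defenders. However, this activity presents challenges due to the tension between data sharing and confidentiality, which results in information retention and often leads to a free-rider problem. The information that is shared therefore represents only the tip of the iceberg. Current literature assumes access to centralized databases containing all the information, but this is not always feasible because of the aforementioned tension. This results in unbalanced or incomplete datasets, which require techniques to expand them. We show how these techniques can skew results and create misleading performance expectations. We propose a novel framework for extracting CTI from distributed data on incidents, vulnerabilities, and indicators of compromise, and demonstrate its use in several practical scenarios in conjunction with the Malware Information Sharing Platform (MISP). Policy implications for CTI sharing are presented and discussed. The proposed system relies on an efficient combination of privacy-enhancing technologies and federated processing. This lets organizations stay in control of their CTI and minimize the risk of exposure or leakage, while enabling the benefits of sharing, more accurate and representative results, and more effective predictive and preventive defenses.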
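Discovering patient-specific imaging markers that are predictive of future disease outcomes can help us better understand individual-level heterogeneity in disease evolution. Indeed, deep learning models that can provide data-driven personalized markers are more likely to be adopted in medical practice. In this work, we demonstrate that data-driven biomarker discovery can be achieved through a counterfactual synthesis process. We show how a deep conditional generative model can be used to perturb local imaging features in baseline images that are pertinent to subject-specific future disease evolution, resulting in a counterfactual image that is expected to have a different future outcome. Candidate biomarkers therefore result from examining the set of features that are perturbed in this process. Through several experiments on a large-scale, multi-scanner, multi-center multiple sclerosis (MS) clinical trial magnetic resonance imaging (MRI) dataset of relapsing-remitting MS (RRMS) patients, we demonstrate that our model produces counterfactuals with changes in imaging features that reflect established clinical markers predictive of future MRI lesional activity at the population level. Additional qualitative results illustrate that our model has the potential to discover novel and subject-specific predictive markers of future activity.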
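The task of learning a probability distribution from samples is ubiquitous across the natural sciences. The output distributions of local quantum circuits form a particularly interesting class of distributions, of key importance both to quantum advantage proposals and to a variety of quantum machine learning algorithms. In this work, we provide an extensive characterization of the learnability of the output distributions of local quantum circuits. Our first result yields insight into the relationship between the efficient learnability and the efficient simulatability of these distributions. Specifically, we prove that the density modelling problem associated with Clifford circuits can be efficiently solved, while for depth $d = n^{\Omega(1)}$ circuits the injection of a single $T$-gate into the circuit renders this problem hard. This result shows that efficient simulatability does not imply efficient learnability. Our second set of results provides insight into the potential and limitations of quantum generative modelling algorithms. We first show that the generative modelling problem associated with depth $d = n^{\Omega(1)}$ local quantum circuits is hard for any learning algorithm, classical or quantum. As a consequence, one cannot use a quantum algorithm to gain a practical advantage for this task. We then show that, for a wide variety of the most practically relevant learning algorithms (including hybrid quantum-classical algorithms), even the generative modelling problem associated with depth $d = \omega(\log(n))$ Clifford circuits is hard. This result places limitations on the applicability of near-term hybrid quantum-classical generative modelling algorithms.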
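This paper presents an architecture for generating music for video games based on the Transformer deep learning model. The system generates music in several layers, following the standard layering strategy currently used in designing video game music. The music adapts to the psychological context of the player, according to the arousal-valence model. Our motivation is to customize the music to the player's tastes: the player can select a preferred musical style through a set of music examples. We discuss current limitations and future prospects, such as collaborative and interactive control of the musical components.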
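The aim of this paper is to study a new methodological framework for systemic risk measures by applying deep learning as a tool to compute the optimal strategy of capital allocations. Under this new framework, systemic risk measures can be interpreted as the minimal amount of cash that secures the aggregated system by allocating capital to the individual institutions before aggregating the individual risks. This problem has no explicit solution except in very limited situations. Deep learning is receiving increasing attention in financial modelling and risk management, and we propose deep learning based algorithms to solve both the primal and the dual problems of the risk measures, and thus to learn fair risk allocations. In particular, our method for the dual problem involves a training philosophy inspired by the well-known Generative Adversarial Networks (GAN) approach and a newly designed direct estimation of the Radon-Nikodym derivative. We close the paper with substantial numerical studies of the subject and provide interpretations of the risk allocations associated with the systemic risk measures. In the particular case of exponential preferences, numerical experiments demonstrate excellent performance of the proposed algorithm when compared with the optimal explicit solution as a benchmark.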
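In this paper, we study the effects of quantization in DNN training. We hypothesize that weight quantization is a form of regularization, and that the amount of regularization is correlated with the quantization level (precision). We confirm our hypothesis with an analytical study and empirical results. By modeling weight quantization as a form of additive noise on the weights, we explore how this noise propagates through the network at training time. We then show that the magnitude of this noise is correlated with the level of quantization. To confirm our analytical study, we performed an extensive set of experiments, summarized in this paper, in which we show that the regularization effect of quantization can be seen in various vision tasks and models, over various datasets. Based on our study, we propose that 8-bit quantization provides a reliable form of regularization in different vision tasks and models.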
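The link between quantization level and noise magnitude can be illustrated with a small sketch. It assumes a symmetric uniform quantizer over [-w_max, w_max], which is a choice made here for illustration and not necessarily the scheme used in the paper; the function names are hypothetical.

```python
import random
from typing import Sequence


def quantize(w: float, bits: int, w_max: float = 1.0) -> float:
    """Round w to the nearest point of a symmetric uniform grid with the given bit width."""
    if bits < 2:
        raise ValueError("bits must be at least 2 for a signed grid")
    levels = 2 ** (bits - 1) - 1          # number of positive grid points
    step = w_max / levels                 # spacing between grid points
    clipped = max(-w_max, min(w_max, w))  # clip to the representable range
    return round(clipped / step) * step


def rms_quantization_noise(weights: Sequence[float], bits: int) -> float:
    """Root-mean-square of (quantized weight - original weight)."""
    errors = [(quantize(w, bits) - w) ** 2 for w in weights]
    return (sum(errors) / len(errors)) ** 0.5


random.seed(0)
# Synthetic weights drawn uniformly from [-1, 1], for illustration only.
weights = [random.uniform(-1.0, 1.0) for _ in range(1000)]
noise_by_bits = {bits: rms_quantization_noise(weights, bits) for bits in (4, 8, 16)}
```

Under this assumed quantizer, the grid spacing roughly halves with each extra bit, so the noise added to the weights shrinks as precision grows, which is consistent with the correlation the abstract describes.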
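Precision medicine for chronic diseases such as multiple sclerosis (MS) involves choosing a treatment that best balances efficacy against side effects and preferences. Making this choice as early as possible is important, as delays in finding an effective therapy can lead to irreversible disability accrual. To this end, we present the first deep neural network model for individualized treatment decisions from baseline magnetic resonance imaging (MRI) for MS patients. Our model (a) predicts future new and enlarging T2-weighted (NE-T2) lesion counts on follow-up MRI under multiple treatments and (b) estimates the conditional average treatment effect (CATE), defined as the predicted future suppression of NE-T2 lesions, for different treatment options relative to placebo. Our model is validated on a proprietary federated dataset of 1817 multi-sequence MRIs acquired from MS patients during four multi-centre randomized clinical trials. Our framework performs binarized regression of future NE-T2 lesions on five different treatments, identifies heterogeneous treatment effects, and provides a personalized treatment recommendation that accounts for treatment-associated risk (e.g., side effects, patient preference, administration difficulties).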
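The CATE definition above can be read as a difference between predicted lesion counts under a treatment and under placebo. The sketch below is only an illustration of that arithmetic: the function names, the additive risk penalty, and the example numbers are assumptions, not the paper's method or results.

```python
from typing import Dict, Optional


def cate_vs_placebo(predicted_counts: Dict[str, float],
                    placebo_key: str = "placebo") -> Dict[str, float]:
    """Predicted lesion count under each treatment minus the count under placebo.

    Negative values mean the treatment is predicted to suppress lesions.
    """
    baseline = predicted_counts[placebo_key]
    return {name: count - baseline
            for name, count in predicted_counts.items()
            if name != placebo_key}


def recommend(cate: Dict[str, float],
              risk_penalty: Optional[Dict[str, float]] = None) -> str:
    """Pick the treatment with the lowest CATE plus an assumed additive risk penalty."""
    penalties = risk_penalty or {}
    return min(cate, key=lambda name: cate[name] + penalties.get(name, 0.0))


# Hypothetical per-patient predictions of future NE-T2 lesion counts.
predictions = {"placebo": 3.0, "drug_a": 1.0, "drug_b": 2.0}
effects = cate_vs_placebo(predictions)
choice_without_risk = recommend(effects)
choice_with_risk = recommend(effects, risk_penalty={"drug_a": 1.5})
```

With these made-up numbers, the more effective option is preferred until its assumed risk penalty outweighs the extra predicted suppression, at which point the recommendation changes.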